Online-Academy
Look, Read, Understand, Apply

Data Mining And Data Warehousing

Agglomerative Hierarchical Clustering

Agglomerative Clustering

Agglomerative clustering is a type of hierarchical clustering that builds clusters bottom-up: it starts with every point in its own cluster and repeatedly merges the closest pair of clusters until all points belong to a single cluster or a stopping condition is met. Unlike K-Means, it does not need initial cluster centers, and unlike DBSCAN, it does not need density parameters. Instead, it uses a linkage criterion to decide which clusters to merge.
Linkage Criteria:
  • Single Linkage: the distance between two clusters is the shortest distance between any point in one cluster and any point in the other. This method can produce long, chain-like clusters and is sensitive to outliers.
  • Complete Linkage: the distance between two clusters is the longest distance between any point in one cluster and any point in the other. This method tends to create compact, roughly spherical clusters and is less sensitive to outliers than single linkage.
  • Average Linkage: the distance between two clusters is the average of the distances between all pairs of points, one taken from each cluster. This method is a middle ground between single and complete linkage and is less affected by outliers than single linkage.
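The three criteria differ only in how they summarize the same set of point-to-point distances. The sketch below illustrates this for two tiny clusters. It assumes Euclidean distance (the text does not fix a metric), and the function name `linkage_distance` and the sample points are made up for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def linkage_distance(cluster_a, cluster_b, method="single"):
    """Distance between two clusters under the chosen linkage criterion."""
    # All point-to-point distances, one point taken from each cluster.
    distances = [dist(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":      # minimum pairwise distance
        return min(distances)
    if method == "complete":    # maximum pairwise distance
        return max(distances)
    if method == "average":     # mean of all pairwise distances
        return sum(distances) / len(distances)
    raise ValueError(f"unknown linkage method: {method!r}")


# Hypothetical example clusters on a line.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

# The four pairwise distances are 4, 6, 3 and 5.
print(linkage_distance(cluster_a, cluster_b, "single"))    # 3.0
print(linkage_distance(cluster_a, cluster_b, "complete"))  # 6.0
print(linkage_distance(cluster_a, cluster_b, "average"))   # 4.5
```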
Working of Agglomerative Clustering
  1. Start with each data point as its own cluster.
  2. Compute the distance between every pair of clusters (the distance matrix).
  3. Merge the two closest clusters according to the chosen linkage.
  4. Repeat steps 2–3 until either:
     • only k clusters remain, or
     • the distance between the two closest clusters exceeds a distance threshold.
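The procedure above can be sketched in a few lines of plain Python. This is a deliberately naive illustration, not an efficient implementation: it recomputes every cluster-to-cluster distance on each pass, which is roughly cubic in the number of points. Euclidean distance is assumed, and the names `cluster_distance`, `agglomerative`, `k` and `distance_threshold` are hypothetical.

```python
from itertools import combinations
from math import dist


def cluster_distance(a, b, linkage="single"):
    """Linkage distance between clusters a and b (lists of points)."""
    distances = [dist(p, q) for p in a for q in b]
    if linkage == "single":
        return min(distances)
    if linkage == "complete":
        return max(distances)
    if linkage == "average":
        return sum(distances) / len(distances)
    raise ValueError(f"unknown linkage: {linkage!r}")


def agglomerative(points, k=1, distance_threshold=None, linkage="single"):
    """Merge clusters bottom-up until k remain or the threshold is exceeded."""
    # Start: every point is its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Compare every pair of clusters and keep the closest pair (i < j).
        (i, j), best = min(
            (((i, j), cluster_distance(clusters[i], clusters[j], linkage))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        # Stop early if even the closest pair is farther apart than allowed.
        if distance_threshold is not None and best > distance_threshold:
            break
        # Merge cluster j into cluster i, then drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (12, 0)]

by_count = agglomerative(points, k=2, linkage="single")
by_threshold = agglomerative(points, distance_threshold=3.0, linkage="single")
print(by_count)      # two clusters: the four left-hand points, and (12, 0)
print(by_threshold)  # three clusters: merging stops once the gap exceeds 3.0
```

Stopping on `k` and stopping on `distance_threshold` correspond to the two stopping conditions listed above. In practice a library implementation would normally be used instead of a loop like this one.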
Dendrogram
A dendrogram is a tree diagram that records the order in which clusters are merged and the distance at which each merge happens. It can be used to:
  • Visualize the hierarchy
  • Choose the number of clusters by “cutting” the tree at a chosen height; each branch the cut crosses becomes one cluster
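As one possible illustration of cutting the tree, the sketch below uses SciPy's `scipy.cluster.hierarchy` module. The choice of SciPy, of average linkage, and of the sample points are assumptions made for this example; they are not prescribed by the text above.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical sample data: two tight pairs and one isolated point.
points = np.array([[0, 0], [0, 1], [5, 0], [5, 1], [12, 0]], dtype=float)

# Z encodes the dendrogram: one row per merge, holding the two merged
# cluster ids, the merge distance (height), and the new cluster's size.
Z = linkage(points, method="average", metric="euclidean")
print(Z[:, 2])  # merge heights, in non-decreasing order

# Cut the tree at height 3.0: merges above that height are undone.
labels_by_height = fcluster(Z, t=3.0, criterion="distance")

# Alternatively, ask for at most 2 flat clusters.
labels_by_count = fcluster(Z, t=2, criterion="maxclust")

print(labels_by_height)  # three distinct labels
print(labels_by_count)   # two distinct labels
```

To draw the tree itself, `scipy.cluster.hierarchy.dendrogram(Z)` can be called with Matplotlib installed. The heights on its vertical axis are the merge distances stored in the third column of `Z`.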